Pull request overview
Adds three new Python-based DocumentDB OSS sample applications to the samples gallery and registers them in registry.yml.
Changes:
- Register 3 new samples in `registry.yml` (fraud detection multi-agent CLI, content semantic search Flask portal, clinical note similarity Flask portal).
- Add `fraud-detection-agent-py` sample (data seeding, vector search retrieval agent, LLM analysis/decision agents).
- Add `content-semantic-search-py` and `clinical-note-similarity-py` Flask samples (ingestion/upload scripts, search + detail pages, styling, sample datasets).
Reviewed changes
Copilot reviewed 39 out of 43 changed files in this pull request and generated 16 comments.
| File | Description |
|---|---|
| registry.yml | Adds three new sample entries to the gallery registry. |
| fraud-detection-agent-py/utils/embeddings.py | Ollama embeddings helper for the fraud detection sample. |
| fraud-detection-agent-py/utils/db.py | Mongo/DocumentDB client + collection helpers for fraud detection sample. |
| fraud-detection-agent-py/utils/__init__.py | Package marker for fraud sample utils. |
| fraud-detection-agent-py/upload_data.py | Seeds transactions, generates embeddings, creates vector index. |
| fraud-detection-agent-py/requirements.txt | Python dependencies for fraud sample. |
| fraud-detection-agent-py/README.md | Documentation for running the fraud multi-agent pipeline. |
| fraud-detection-agent-py/main.py | CLI entry point to run sample transactions through agents. |
| fraud-detection-agent-py/data/transactions.json | Sample labeled transaction dataset for vector retrieval. |
| fraud-detection-agent-py/cleanup.py | Drops the fraud sample collection. |
| fraud-detection-agent-py/agents/retrieval_agent.py | Vector-search retrieval agent implementation. |
| fraud-detection-agent-py/agents/decision_agent.py | LLM decision agent implementation. |
| fraud-detection-agent-py/agents/analysis_agent.py | LLM analysis agent implementation. |
| fraud-detection-agent-py/agents/__init__.py | Package marker for fraud sample agents. |
| fraud-detection-agent-py/.gitignore | Ignores env/venv/build artifacts for fraud sample. |
| fraud-detection-agent-py/.env.example | Example env vars for fraud sample. |
| content-semantic-search-py/utils/embeddings.py | Ollama embeddings helper for semantic search sample. |
| content-semantic-search-py/utils/db.py | Mongo/DocumentDB client + collection helpers for semantic search sample. |
| content-semantic-search-py/utils/__init__.py | Package marker for semantic search utils. |
| content-semantic-search-py/templates/index.html | Search UI template for semantic search portal. |
| content-semantic-search-py/templates/article.html | Article detail template for semantic search portal. |
| content-semantic-search-py/static/style.css | Styling for semantic search portal. |
| content-semantic-search-py/requirements.txt | Python dependencies for semantic search sample. |
| content-semantic-search-py/README.md | Documentation for semantic search portal + ingestion options. |
| content-semantic-search-py/ingest.py | Ingestion script for sample/custom text/PDF documents + vector index. |
| content-semantic-search-py/data/articles.json | Sample content dataset for semantic search. |
| content-semantic-search-py/app.py | Flask app implementing semantic search + detail routes. |
| content-semantic-search-py/.gitignore | Ignores env/venv/build artifacts and uploads folder. |
| content-semantic-search-py/.env.example | Example env vars for semantic search sample. |
| clinical-note-similarity-py/utils/embeddings.py | Ollama embeddings helper for clinical notes sample. |
| clinical-note-similarity-py/utils/db.py | Mongo/DocumentDB client + collection helpers for clinical notes sample. |
| clinical-note-similarity-py/utils/__init__.py | Package marker for clinical notes utils. |
| clinical-note-similarity-py/upload_notes.py | Seeds clinical notes, generates embeddings, creates vector index. |
| clinical-note-similarity-py/templates/note.html | Note detail template for clinical notes portal. |
| clinical-note-similarity-py/templates/index.html | Search UI template for clinical notes portal. |
| clinical-note-similarity-py/static/style.css | Styling for clinical notes portal. |
| clinical-note-similarity-py/requirements.txt | Python dependencies for clinical notes sample. |
| clinical-note-similarity-py/README.md | Documentation + disclaimer for clinical notes similarity explorer. |
| clinical-note-similarity-py/data/clinical_notes.json | Fictional clinical note dataset for similarity search. |
| clinical-note-similarity-py/cleanup.py | Drops the clinical notes collection. |
| clinical-note-similarity-py/app.py | Flask app implementing similarity search + note detail routes. |
| clinical-note-similarity-py/.gitignore | Ignores env/venv/build artifacts for clinical notes sample. |
| clinical-note-similarity-py/.env.example | Example env vars for clinical notes sample. |
    def get_client() -> MongoClient:
        uri = os.environ["DOCUMENTDB_URI"]
        return MongoClient(uri, tlsAllowInvalidCertificates=True)
@copilot apply changes based on this feedback
    print(f"Starting Clinical Note Similarity Explorer on http://localhost:{port}")
    app.run(debug=True, port=port)
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Pull request overview
Adds three new Python sample projects to the DocumentDB samples gallery and updates shared documentation to reflect recommended local-dev and vector-search usage patterns.
Changes:
- Registered 3 new Python samples in `registry.yml` (fraud detection multi-agent, content semantic search portal, clinical note similarity explorer).
- Added full sample implementations (CLI + Flask apps), including seed/ingest scripts and sample datasets.
- Expanded `SKILL.md` with Docker Compose guidance, safer Python client patterns, and vector search tuning notes.
Reviewed changes
Copilot reviewed 40 out of 44 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| SKILL.md | Updates the main skill doc with Docker Compose guidance, safer Python connection pattern, and vector search tuning sections. |
| registry.yml | Adds 3 new sample entries to the gallery registry. |
| fraud-detection-agent-py/utils/embeddings.py | Ollama embedding helper for the fraud detection sample. |
| fraud-detection-agent-py/utils/db.py | DocumentDB (PyMongo) client/collection helpers for the fraud detection sample. |
| fraud-detection-agent-py/utils/__init__.py | Package marker for fraud sample utilities. |
| fraud-detection-agent-py/upload_data.py | Seeds fraud sample data, generates embeddings, and creates vector index. |
| fraud-detection-agent-py/requirements.txt | Python dependencies for the fraud detection sample. |
| fraud-detection-agent-py/README.md | Setup and usage documentation for the fraud detection sample. |
| fraud-detection-agent-py/main.py | Runs the fraud multi-agent pipeline over example transactions. |
| fraud-detection-agent-py/data/transactions.json | Sample transaction dataset with fraud labels for the fraud sample. |
| fraud-detection-agent-py/cleanup.py | Convenience script to drop the fraud sample collection. |
| fraud-detection-agent-py/agents/retrieval_agent.py | Retrieval agent implementation using DocumentDB vector search. |
| fraud-detection-agent-py/agents/decision_agent.py | Decision agent implementation using Ollama chat endpoint. |
| fraud-detection-agent-py/agents/analysis_agent.py | Analysis agent implementation using Ollama chat endpoint. |
| fraud-detection-agent-py/agents/__init__.py | Package marker for fraud sample agents. |
| fraud-detection-agent-py/.gitignore | Ignores env/venv/build artifacts for the fraud sample. |
| fraud-detection-agent-py/.env.example | Example environment configuration for the fraud sample. |
| content-semantic-search-py/utils/embeddings.py | Ollama embedding helper for the content semantic search sample. |
| content-semantic-search-py/utils/db.py | DocumentDB (PyMongo) client/collection helpers with safer TLS flag handling. |
| content-semantic-search-py/utils/__init__.py | Package marker for content sample utilities. |
| content-semantic-search-py/templates/index.html | Search UI template for the content semantic search portal. |
| content-semantic-search-py/templates/article.html | Article detail UI template for the content semantic search portal. |
| content-semantic-search-py/static/style.css | Styling for the content semantic search portal UI. |
| content-semantic-search-py/requirements.txt | Python dependencies for the content semantic search sample. |
| content-semantic-search-py/README.md | Setup and usage documentation for the content semantic search sample. |
| content-semantic-search-py/ingest.py | Ingests sample/custom text/PDF content, embeds, and creates vector index. |
| content-semantic-search-py/data/articles.json | Sample articles dataset for content semantic search. |
| content-semantic-search-py/app.py | Flask app implementing semantic search and article detail endpoints. |
| content-semantic-search-py/.gitignore | Ignores env/venv/build artifacts (and uploads dir) for the content sample. |
| content-semantic-search-py/.env.example | Example environment configuration for the content semantic search sample. |
| clinical-note-similarity-py/utils/embeddings.py | Ollama embedding helper for the clinical note similarity sample. |
| clinical-note-similarity-py/utils/db.py | DocumentDB (PyMongo) client/collection helpers with safer TLS flag handling. |
| clinical-note-similarity-py/utils/__init__.py | Package marker for clinical sample utilities. |
| clinical-note-similarity-py/upload_notes.py | Seeds clinical notes, generates embeddings, and creates vector index. |
| clinical-note-similarity-py/templates/note.html | Full-note UI template for the clinical similarity explorer. |
| clinical-note-similarity-py/templates/index.html | Search UI template for the clinical similarity explorer. |
| clinical-note-similarity-py/static/style.css | Styling for the clinical similarity explorer UI. |
| clinical-note-similarity-py/requirements.txt | Python dependencies for the clinical note similarity sample. |
| clinical-note-similarity-py/README.md | Setup and usage documentation for the clinical note similarity sample. |
| clinical-note-similarity-py/data/clinical_notes.json | Fictional, de-identified clinical notes dataset for the clinical sample. |
| clinical-note-similarity-py/cleanup.py | Convenience script to drop the clinical notes collection. |
| clinical-note-similarity-py/app.py | Flask app implementing similarity search and note detail endpoints. |
| clinical-note-similarity-py/.gitignore | Ignores env/venv/build artifacts for the clinical sample. |
| clinical-note-similarity-py/.env.example | Example environment configuration for the clinical note similarity sample. |
    def get_client() -> MongoClient:
        uri = os.environ["DOCUMENTDB_URI"]
        return MongoClient(uri, tlsAllowInvalidCertificates=True)

`get_client()` will raise a raw `KeyError` if `DOCUMENTDB_URI` is missing (via `os.environ[...]`), and it unconditionally sets `tlsAllowInvalidCertificates=True`. For samples, prefer `os.getenv` with a clear exit message when the URI is missing, and only enable invalid certs when an explicit env flag is set (or when connecting to the local dev container) to avoid hard-coding insecure defaults.
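
One possible shape for this suggestion is sketched below. It is only an illustration, not the fix that was actually committed: the flag name `DOCUMENTDB_TLS_ALLOW_INVALID_CERTS` and the helper `client_settings` are hypothetical, and PyMongo is imported lazily so the settings logic can be exercised without a database or the driver installed.

```python
import os
import sys


def client_settings(env=None):
    """Return (uri, options) for MongoClient, read from environment variables."""
    env = os.environ if env is None else env

    uri = env.get("DOCUMENTDB_URI")
    if not uri:
        # Exit with a readable message instead of a raw KeyError traceback.
        sys.exit("DOCUMENTDB_URI is not set. Copy .env.example to .env and set it.")

    # Hypothetical opt-in flag (the name is an assumption, not from the PR).
    flag = env.get("DOCUMENTDB_TLS_ALLOW_INVALID_CERTS", "")
    allow_invalid = flag.strip().lower() in {"1", "true", "yes"}

    options = {}
    if allow_invalid:
        # Only skip certificate validation when explicitly requested,
        # e.g. for a local dev container with a self-signed certificate.
        options["tlsAllowInvalidCertificates"] = True
    return uri, options


def get_client(env=None):
    """Build a MongoClient from the validated settings."""
    from pymongo import MongoClient  # lazy import keeps client_settings driver-free

    uri, options = client_settings(env)
    return MongoClient(uri, **options)
```

With this layout the insecure option is off unless the flag is set, so a copy-pasted sample does not silently disable certificate checks against a real cluster.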
    query = request.form.get("query", "").strip()
    specialty = request.form.get("specialty", "all")
    num_results = int(request.form.get("num_results", 5))
    specialties = get_specialties()

`num_results = int(request.form.get('num_results', 5))` can raise `ValueError` if the request is tampered with (or the field is missing/empty), causing a 500. Consider parsing `num_results` with a try/except and clamping to a small set of allowed values (or at least >= 1) similar to the handling in `content-semantic-search-py/app.py`.
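
A minimal sketch of that parsing, assuming a fixed set of allowed result counts, follows. The helper name `parse_num_results` and the values `(3, 5, 10)` are hypothetical; this does not reproduce the handling in `content-semantic-search-py/app.py`.

```python
ALLOWED_RESULT_COUNTS = (3, 5, 10)  # hypothetical choices offered by the form
DEFAULT_RESULT_COUNT = 5


def parse_num_results(raw, allowed=ALLOWED_RESULT_COUNTS, default=DEFAULT_RESULT_COUNT):
    """Parse a form value into an allowed result count, never raising."""
    try:
        value = int(raw)
    except (TypeError, ValueError):
        # Missing (None), empty, or non-numeric input falls back to the default.
        return default
    # Reject out-of-range or tampered values instead of passing them to the query.
    return value if value in allowed else default


# In the Flask view this would replace the bare int(...) call, for example:
#   num_results = parse_num_results(request.form.get("num_results"))
```

Falling back to a default rather than returning an error keeps the search page usable even when the field is absent, at the cost of silently ignoring bad input.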
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>